Modern Greek Corpus Taxonomy
Abstract
The aim of this paper is to explore the way in which different kinds of linguistic variables can be used to discriminate text type in preclassified press texts. The Modern Greek (MG) language, due to its past diglossic status, exhibits extended variation in written texts across all linguistic levels, and this variation can be exploited in text categorization tasks. The research presented uses Discriminant Function Analysis (DFA) as a text categorization method and explores the way different variable groups contribute to text type discrimination.
Similar resources
Ensemble Learning of Economic Taxonomy Relations from Modern Greek Corpora
This paper proposes the use of ensemble learning for the identification of taxonomic relations between Modern Greek economic terms. Unlike previous approaches, apart from is-a and part-of relations, the present work deals also with relation types that are characteristic of the economic domain. Semantic and syntactic information governing the term pairs is encoded in a novel feature-vector repre...
Typological Perspectives on Modern Greek
Greek is one of the more intensely-studied languages in the world, with regard to its history, structure, and social setting, but despite this special place that Greek holds in the pantheon of human languages, the modern form of the language has played a relatively minor role in linguistic studies aimed at developing a general typology of natural language, i.e. the development of a taxonomy of ...
Development of a Modern Greek Broadcast-News Corpus and Speech Recognition System
We report on the creation of a Modern Greek broadcast-news corpus as a pre-requisite to build a large-vocabulary continuous-speech recognition system. We discuss lexical modelling with respect to pronunciation generation and examine the effects of the lexicon size on word accuracies. Peculiarities of Modern Greek as a highly inflectional language and their challenges for speech recognition are d...
Quantitative parameters in corpus design: Estimating the optimum text size in Modern Greek language
The aim of this paper is to investigate the major quantitative parameters related to the definition of the optimum text size in Modern Greek corpus development. Using the Hellenic National Corpus (HNC) (Hatzigeorgiu et al., 2000) as a reference point we estimated a number of critical statistical measures regarding feature counting in different text sizes. The results indicate that frequent ling...
A Tool for Multi-Word Expression Extraction in Modern Greek Using Syntactic Parsing
This paper presents a tool for extracting multi-word expressions from corpora in Modern Greek, which is used together with a parallel concordancer to augment the lexicon of a rule-based machine-translation system. The tool is part of a larger extraction system that relies, in turn, on a multilingual parser developed over the past decade in our laboratory. The paper reviews the various NLP module...
Vergina: A Modern Greek Speech Database for Speech Synthesis
The present paper outlines the Vergina speech database, which was developed in support of research and development of corpus-based unit selection and statistical parametric speech synthesis systems for Modern Greek language. In the following, we describe the design, development and implementation of the recording campaign, as well as the annotation of the database. Specifically, a text corpus o...